recognition rate
Synthetic Error Injection Fails to Elicit Self-Correction in Language Models
Wu, David X., Kapur, Shreyas, Sahai, Anant, Russell, Stuart
Reinforcement learning has become the dominant paradigm for eliciting reasoning and self-correction capabilities in large language models, but its computational expense motivates exploration of alternatives. Inspired by techniques from autonomous driving and robotics, we investigate whether supervised learning with synthetic error injection can induce self-correction abilities in language models. Our approach inserts artificial errors into reasoning chains, masks them, and supervises the model to recognize and correct these mistakes. Despite the intuitive appeal of this method, we find that it fails to significantly improve performance even on simple synthetic tasks across multiple models. Moreover, even when the model catches its own error, it often parrots the original mistake. We find that the distribution shift from synthetic errors to on-policy errors significantly degrades the error-correction capabilities of the fine-tuned model, even with good synthetic coverage of on-policy errors. Our results help explain why on-policy reinforcement learning methods have proven uniquely effective for eliciting self-correction.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.75)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
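To make the error-injection recipe above concrete, here is a minimal Python sketch of how one such training example might be constructed for an arithmetic reasoning chain. The corruption scheme, masking convention, and the `inject_error`/`build_example` helpers are illustrative assumptions, not the paper's exact pipeline.

```python
import random

def inject_error(steps: list[str], answers: list[int]) -> tuple[list[str], int]:
    """Corrupt one intermediate result in a reasoning chain.

    Returns the corrupted chain and the index of the injected error.
    (Illustrative: the paper's corruption scheme may differ.)
    """
    idx = random.randrange(len(steps) - 1)                # spare the final answer line
    wrong = answers[idx] + random.choice([-2, -1, 1, 2])  # small numeric perturbation
    corrupted = steps.copy()
    corrupted[idx] = corrupted[idx].rsplit("=", 1)[0] + f"= {wrong}"
    return corrupted, idx

def build_example(steps: list[str], answers: list[int], correction: str) -> dict:
    """Pair a corrupted chain with a recognize-and-correct target.

    The injected tokens would be loss-masked, so the model is supervised
    to spot and fix the mistake, never to reproduce it.
    """
    corrupted, idx = inject_error(steps, answers)
    return {
        "input": "\n".join(corrupted),
        "target": f"Wait, step {idx + 1} is wrong. {correction}",
    }

example = build_example(
    steps=["3 * 4 = 12", "12 + 5 = 17", "Answer: 17"],
    answers=[12, 17, 17],
    correction="Redoing it: 3 * 4 = 12 and 12 + 5 = 17, so the answer is 17.",
)
print(example["input"], "---", example["target"], sep="\n")
```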
DeepXPalm: Tilt and Position Rendering using Palm-worn Haptic Display and CNN-based Tactile Pattern Recognition
Cabrera, Miguel Altamirano, Sautenkov, Oleg, Tirado, Jonathan, Fedoseev, Aleksey, Kopanev, Pavel, Kajimoto, Hiroyuki, Tsetserukou, Dzmitry
Telemanipulation of deformable objects requires high precision and dexterity from the users, which can be increased by kinesthetic and tactile feedback. However, the object shape can change dynamically, causing ambiguous perception of its alignment and hence errors in the robot positioning. Therefore, the tilt angle and position classification problem has to be solved to present a clear tactile pattern to the user. This work presents a telemanipulation system for plastic pipettes consisting of a multi-contact haptic device, LinkGlide, that delivers haptic feedback at the user's palm, and two tactile sensor arrays embedded in the 2-finger Robotiq gripper. We propose a novel approach based on Convolutional Neural Networks (CNN) to detect the tilt and position while grasping deformable objects. The CNN generates a mask based on the recognized tilt and position data to render further multi-contact tactile stimuli provided to the user during the telemanipulation. The study showed that with the CNN algorithm and the preset masks, users' recognition of tilt and position increased from 9.67% with the direct data to 82.5%.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > Costa Rica > Heredia Province > Heredia (0.04)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.48)
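The tilt-and-position classifier described above can be pictured as a very small convolutional network over tactile frames. The sketch below assumes two 4x8 pressure arrays stacked as input channels and nine tilt/position classes; the paper's actual sensor layout, architecture, and class count may differ.

```python
import torch
import torch.nn as nn

class TiltPositionCNN(nn.Module):
    """Toy classifier over a tactile sensor array frame.

    Assumed shapes: two 4x8 pressure arrays stacked as 2 input channels;
    the real sensor geometry and class count may differ.
    """
    def __init__(self, n_classes: int = 9):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),           # pool to one vector per frame
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                      # x: (batch, 2, 4, 8)
        return self.head(self.features(x).flatten(1))

model = TiltPositionCNN()
frame = torch.randn(1, 2, 4, 8)               # one fake tactile frame
tilt_class = model(frame).argmax(dim=1)       # predicted tilt/position bucket
print(tilt_class)
```

The predicted class would then index into a preset mask that drives the multi-contact display.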
FiDTouch: A 3D Wearable Haptic Display for the Finger Pad
Trinitatova, Daria, Tsetserukou, Dzmitry
The applications of fingertip haptic devices have spread to various fields, from revolutionizing virtual reality and medical training simulations to facilitating remote robotic operations, offering great potential for enhancing user experiences, improving training outcomes, and enabling new forms of interaction. In this work, we present FiDTouch, a 3D wearable haptic device that delivers cutaneous stimuli to the finger pad, such as contact, pressure, encounter, skin stretch, and vibrotactile feedback. The use of a tiny inverted Delta robot in the mechanism design allows providing accurate contact and fast-changing dynamic stimuli to the finger pad surface. The performance of the developed display was evaluated in a two-stage user study of the perception of static spatial contact stimuli and skin stretch stimuli generated on the finger pad. By providing users with precise touch and force stimuli, the proposed display can enhance user immersion and efficiency in the fields of human-computer and human-robot interaction. Fingertip haptic devices (FHDs) enrich the user experience in the realm of human-computer and human-robot interaction, bridging the gap between the digital and physical worlds by providing various cutaneous and force feedback directly to the user's fingertips. The ability to accurately reproduce the feeling of grasping in a virtual or remote environment is essential for creating a realistic experience in Virtual Reality, teleoperation, and telexistence, since finger pads are used for interactions with physical objects and probing the environment in most cases.
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Asia > Russia (0.04)
Leveraging Large Language Models for Explainable Activity Recognition in Smart Homes: A Critical Evaluation
Fiori, Michele, Civitarese, Gabriele, Choudhary, Priyankar, Bettini, Claudio
Explainable Artificial Intelligence (XAI) aims to uncover the inner reasoning of machine learning models. In IoT systems, XAI improves the transparency of models processing sensor data from multiple heterogeneous devices, ensuring end-users understand and trust their outputs. Among its many applications, XAI has also been applied to sensor-based Activities of Daily Living (ADL) recognition in smart homes. Existing approaches highlight which sensor events are most important for each predicted activity, using simple rules to convert these events into natural language explanations for non-expert users. However, these methods produce rigid explanations that lack natural language flexibility and do not scale. With the recent rise of Large Language Models (LLMs), it is worth exploring whether they can enhance explanation generation, given their demonstrated knowledge of human activities. This paper investigates potential approaches to combining XAI and LLMs for sensor-based ADL recognition. We evaluate whether LLMs can be used: a) as explainable zero-shot ADL recognition models, avoiding costly labeled data collection, and b) to automate the generation of explanations for existing data-driven XAI approaches when training data is available and the goal is higher recognition rates. Our critical evaluation provides insights into the benefits and challenges of using LLMs for explainable ADL recognition.
- Europe > Italy > Lombardy > Milan (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
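As a sketch of the zero-shot direction (a), sensor events might be serialized into a single classification prompt that also asks the model to justify itself, so the answer doubles as the explanation. The label set, event format, and prompt wording below are illustrative assumptions, not the paper's protocol.

```python
# Hypothetical sketch of zero-shot ADL classification via an LLM prompt.
# The labels and event encoding are placeholders, not the paper's setup.
ADL_LABELS = ["cooking", "eating", "sleeping", "watching TV", "showering"]

def build_prompt(events: list[tuple[str, str]]) -> str:
    lines = "\n".join(f"{t}: {sensor} fired" for t, sensor in events)
    return (
        "Sensor events from a smart home:\n"
        f"{lines}\n"
        f"Which activity is the resident performing ({', '.join(ADL_LABELS)})? "
        "Answer with one label and a one-sentence explanation citing the events."
    )

events = [
    ("08:02", "kitchen motion"),
    ("08:03", "fridge door"),
    ("08:05", "stove power"),
]
print(build_prompt(events))
# The returned text would be sent to an LLM of choice; the cited events
# in its answer serve as the natural-language explanation for the user.
```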
LLM-Glasses: GenAI-driven Glasses with Haptic Feedback for Navigation of Visually Impaired People
Tokmurziyev, Issatay, Cabrera, Miguel Altamirano, Khan, Muhammad Haris, Mahmoud, Yara, Moreno, Luis, Tsetserukou, Dzmitry
We present LLM-Glasses, a wearable navigation system designed to assist visually impaired individuals by combining haptic feedback, YOLO-World object detection, and GPT-4o-driven reasoning. The system delivers real-time tactile guidance via temple-mounted actuators, enabling intuitive and independent navigation. Three user studies were conducted to evaluate its effectiveness: (1) a haptic pattern recognition study achieving an 81.3% average recognition rate across 13 distinct patterns, (2) a VICON-based navigation study in which participants successfully followed predefined paths in open spaces, and (3) an LLM-guided video evaluation demonstrating 91.8% accuracy in open scenarios, 84.6% with static obstacles, and 81.5% with dynamic obstacles. These results demonstrate the system's reliability in controlled environments, with ongoing work focusing on refining its responsiveness and adaptability to diverse real-world scenarios. LLM-Glasses showcases the potential of combining generative AI with haptic interfaces to empower visually impaired individuals with intuitive and effective mobility solutions.
- North America > Costa Rica > Heredia Province > Heredia (0.04)
- Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
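The detection-to-haptics step of such a pipeline can be illustrated with a few lines of glue code. Everything here, the `Detection` record, the proximity threshold, and the left/right temple mapping, is a hypothetical stand-in for the actual LLM-Glasses stack (YOLO-World detections filtered through GPT-4o reasoning).

```python
# Illustrative glue code only: reducing detections to a left/right/none
# haptic cue. Thresholds and the detection format are assumptions.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    x_center: float  # normalized [0, 1] across the camera frame
    area: float      # normalized box area, crude proxy for proximity

def haptic_cue(detections: list[Detection], near: float = 0.15) -> str:
    """Steer the user away from the largest (closest) obstacle."""
    obstacles = [d for d in detections if d.area >= near]
    if not obstacles:
        return "none"            # path clear, no vibration
    closest = max(obstacles, key=lambda d: d.area)
    # obstacle on the left -> buzz the right temple, and vice versa
    return "right" if closest.x_center < 0.5 else "left"

print(haptic_cue([Detection("person", 0.3, 0.2)]))  # -> "right"
```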
Perception of Emotions in Human and Robot Faces: Is the Eye Region Enough?
Mishra, Chinmaya, Skantze, Gabriel, Hagoort, Peter, Verdonschot, Rinus
The increased interest in developing next-gen social robots has raised questions about the factors affecting the perception of robot emotions. This study investigates the impact of robot appearances (humanlike, mechanical) and face regions (full-face, eye-region) on human perception of robot emotions. A between-subjects user study (N = 305) was conducted where participants were asked to identify the emotions being displayed in videos of robot faces, as well as a human baseline. Our findings reveal three important insights for effective social robot face design in Human-Robot Interaction (HRI): Firstly, robots equipped with a back-projected, fully animated face - regardless of whether they are more human-like or more mechanical-looking - demonstrate a capacity for emotional expression comparable to that of humans. Secondly, the recognition accuracy of emotional expressions in both humans and robots declines when only the eye region is visible. Lastly, within the constraint of only the eye region being visible, robots with more human-like features significantly enhance emotion recognition.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > Netherlands > Limburg > Maastricht (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Research Report > Experimental Study > Negative Result (0.46)
- Health & Medicine (1.00)
- Media > Film (0.93)
- Leisure & Entertainment (0.68)
- Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)
TiltXter: CNN-based Electro-tactile Rendering of Tilt Angle for Telemanipulation of Pasteur Pipettes
Cabrera, Miguel Altamirano, Tirado, Jonathan, Fedoseev, Aleksey, Sautenkov, Oleg, Poliakov, Vladimir, Kopanev, Pavel, Tsetserukou, Dzmitry
The shape of deformable objects can change drastically during grasping by robotic grippers, causing an ambiguous perception of their alignment and hence resulting in errors in robot positioning and telemanipulation. Rendering clear tactile patterns is fundamental to increasing users' precision and dexterity through tactile haptic feedback during telemanipulation. Therefore, different methods have to be studied to decode the sensors' data into haptic stimuli. This work presents a telemanipulation system for plastic pipettes that consists of a Force Dimension Omega.7 haptic interface endowed with two electro-stimulation arrays and two tactile sensor arrays embedded in the 2-finger Robotiq gripper. We propose a novel approach based on convolutional neural networks (CNN) to detect the tilt of deformable objects. The CNN generates a tactile pattern based on recognized tilt data to render further electro-tactile stimuli provided to the user during the telemanipulation. The study showed that using the CNN algorithm, tilt recognition by users increased from 23.13% with the downsized data to 57.9%, and the success rate during teleoperation increased from 53.12% with the downsized data to 92.18% with the tactile patterns generated by the CNN.
- North America > United States > Rhode Island > Newport County > Newport (0.04)
- North America > Costa Rica > Heredia Province > Heredia (0.04)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- (2 more...)
- Research Report > New Finding (0.96)
- Research Report > Experimental Study (0.71)
- Government > Regional Government (0.47)
- Health & Medicine (0.46)
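The rendering half of this pipeline, turning the CNN's predicted tilt class into an electro-tactile stimulus, can be pictured as a lookup into preset electrode masks. The 2x5 electrode geometry and the three patterns below are assumptions for illustration, not the patterns used in the study.

```python
import numpy as np

# Hypothetical rendering table: which electrodes of a 2x5 stimulation
# array to activate per recognized tilt class. The real electrode
# geometry and pattern design may differ.
PATTERNS = {
    "level":      np.array([[0, 0, 1, 0, 0],
                            [0, 0, 1, 0, 0]]),
    "tilt_left":  np.array([[1, 1, 0, 0, 0],
                            [1, 1, 0, 0, 0]]),
    "tilt_right": np.array([[0, 0, 0, 1, 1],
                            [0, 0, 0, 1, 1]]),
}

def render(tilt_class: str) -> np.ndarray:
    """Return the electrode on/off mask for the CNN's predicted class."""
    return PATTERNS[tilt_class]

print(render("tilt_left"))
```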
A Prior Embedding-Driven Architecture for Long Distance Blind Iris Recognition
Xiong, Qi, Zhang, Xinman, Shen, Jun
Blind iris images, which result from unknown degradation during long-distance iris recognition, often lead to decreased iris recognition rates. Currently, little existing literature offers a solution to this problem. In response, we propose a prior embedding-driven architecture for long-distance blind iris recognition. We first propose a blind iris image restoration network called Iris-PPRGAN. To effectively restore the texture of the blind iris, Iris-PPRGAN includes a Generative Adversarial Network (GAN) used as a prior decoder and a DNN used as the encoder. To extract iris features more efficiently, we then propose a robust iris classifier, which we call Insight-Iris, obtained by modifying the bottleneck module of InsightFace. A low-quality blind iris image is first restored by Iris-PPRGAN; the restored iris image then undergoes recognition via Insight-Iris. Experimental results on the public CASIA-Iris-distance dataset demonstrate that our proposed method achieves significantly superior results to state-of-the-art blind iris restoration methods, both quantitatively and qualitatively. Specifically, the recognition rate for long-distance blind iris images reaches 90% after processing with our method, an improvement of approximately ten percentage points over images without restoration.
- Asia > China > Shaanxi Province > Xi'an (0.04)
- Oceania > Australia > New South Wales > Wollongong (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
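The restore-then-recognize pipeline reads as two modules composed in sequence. The toy networks below are stand-ins that only mirror the data flow; Iris-PPRGAN (GAN prior decoder plus DNN encoder) and Insight-Iris are far larger models, so this is a minimal sketch of the composition, not the architecture.

```python
import torch
import torch.nn as nn

class TinyRestorer(nn.Module):          # stand-in for Iris-PPRGAN
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )
    def forward(self, x):
        return x + self.net(x)          # predict a residual correction

class TinyRecognizer(nn.Module):        # stand-in for Insight-Iris
    def __init__(self, n_ids: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, n_ids),
        )
    def forward(self, x):
        return self.net(x)

restorer, recognizer = TinyRestorer(), TinyRecognizer()
blind = torch.randn(1, 1, 64, 64)               # degraded long-distance capture
identity = recognizer(restorer(blind)).argmax(dim=1)
print(identity)
```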
MoveTouch: Robotic Motion Capturing System with Wearable Tactile Display to Achieve Safe HRI
Alabbas, Ali, Cabrera, Miguel Altamirano, Sayed, Mohamed, Alyounes, Oussama, Liu, Qian, Tsetserukou, Dzmitry
The collaborative robot market is flourishing as there is a trend towards simplification, modularity, and increased flexibility on the production line. However, when humans and robots collaborate in a shared environment, human safety must be a priority. We introduce a novel wearable robotic system to enhance safety during Human-Robot Interaction (HRI). The proposed wearable robot is designed to hold a fiducial marker and maintain its visibility to a motion capture system, which, in turn, localizes the user's hand with good accuracy and low latency and provides vibrotactile feedback to the user's wrist. The vibrotactile feedback guides the user's hand movement during collaborative tasks in order to increase safety and enhance collaboration efficiency. A user study was conducted to assess the recognition and discriminability of ten designed vibration patterns applied to the upper (dorsal) and lower (volar) sides of the user's wrist. The results show that the pattern recognition rate on the volar side was higher, with an average of 75.64% among all users. Four patterns with a high recognition rate were chosen to be incorporated into our system. A second experiment was carried out to evaluate users' responses to the chosen patterns in real-world collaborative tasks. Results show that all participants responded to the patterns correctly, and the average response time for the patterns was between 0.24 and 2.41 seconds.
- North America > Costa Rica > Heredia Province > Heredia (0.04)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Asia > China > Liaoning Province > Dalian (0.04)
- (2 more...)
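A vibrotactile pattern of the kind studied above can be encoded as a timed sequence of motor activations. The motor count, IDs, and timings below are illustrative guesses, not the ten patterns from the study.

```python
import time

# Illustrative encoding of wrist vibration patterns as (motor, seconds)
# sequences; motor layout and timings are assumptions.
PATTERNS = {
    "move_left":  [(0, 0.2), (1, 0.2), (2, 0.2)],  # sweep across the wrist
    "move_right": [(2, 0.2), (1, 0.2), (0, 0.2)],
    "stop":       [(0, 0.5), (1, 0.5), (2, 0.5)],  # long pulses on every motor
}

def play(pattern: str, drive_motor=lambda m, on: None):
    """Fire each motor in sequence; `drive_motor` is the hardware hook."""
    for motor, duration in PATTERNS[pattern]:
        drive_motor(motor, True)
        time.sleep(duration)
        drive_motor(motor, False)

play("move_left", drive_motor=lambda m, on: print(f"motor {m} -> {on}"))
```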
FlockGPT: Guiding UAV Flocking with Linguistic Orchestration
Lykov, Artem, Karaf, Sausar, Martynov, Mikhail, Serpiva, Valerii, Fedoseev, Aleksey, Konenkov, Mikhail, Tsetserukou, Dzmitry
This article presents the world's first rapid drone flocking control using natural language through generative AI. The described approach enables the intuitive orchestration of a flock of any size to achieve the desired geometry. The key feature of the method is a new interface based on Large Language Models for communicating with the user and generating the target geometry descriptions. Users can interactively modify or provide comments during the construction of the flock geometry model. By combining flocking technology with a target surface defined by a signed distance function, smooth and adaptive movement of the drone swarm between target states is achieved. Our user study on FlockGPT confirmed a high level of intuitive control over drone flocking by users. Subjects who had never previously controlled a swarm of drones were able to construct complex figures in just a few iterations and to accurately distinguish the formed swarm figures. The results revealed a high recognition rate for six different geometric patterns generated through the LLM-based interface and performed by a simulated drone flock (a mean of 80%, with a maximum of 93% for the cube and tetrahedron patterns). Users reported low temporal demand (NASA-TLX score of 19.2), high performance (NASA-TLX score of 26), attractiveness (UEQ score of 1.94), and hedonic quality (UEQ score of 1.81) for the developed system. The FlockGPT demo code repository can be found at: coming soon
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)
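The signed-distance-function idea is easy to make concrete: if each drone descends the SDF of the target surface, the flock settles onto that surface. The sketch below uses the standard axis-aligned-cube SDF with a finite-difference gradient; the gain and update rule are illustrative, not FlockGPT's controller.

```python
import numpy as np

def sdf_cube(p: np.ndarray, half: float = 1.0) -> float:
    """Signed distance from point p (shape (3,)) to an axis-aligned cube."""
    q = np.abs(p) - half
    return np.linalg.norm(np.maximum(q, 0.0)) + min(q.max(), 0.0)

def sdf_grad(p: np.ndarray, eps: float = 1e-4) -> np.ndarray:
    """Central finite-difference gradient of the SDF."""
    g = np.zeros(3)
    for i in range(3):
        d = np.zeros(3); d[i] = eps
        g[i] = (sdf_cube(p + d) - sdf_cube(p - d)) / (2 * eps)
    return g

def step(positions: np.ndarray, gain: float = 0.5) -> np.ndarray:
    """Move each drone along -grad(SDF) * SDF, i.e. toward the surface."""
    return np.array([p - gain * sdf_cube(p) * sdf_grad(p) for p in positions])

drones = np.random.uniform(-3, 3, size=(5, 3))   # random initial swarm
for _ in range(50):
    drones = step(drones)
print([round(sdf_cube(p), 3) for p in drones])   # ~0: on the cube surface
```

Swapping `sdf_cube` for another shape's SDF retargets the whole flock, which is what makes an LLM-generated geometry description a natural control interface.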